Privacy Preservation through Data Generation

Authors

  • Jilles Vreeken
  • Matthijs van Leeuwen
  • Arno Siebes
Abstract

Many databases will not or cannot be disclosed without strong guarantees that no sensitive information can be extracted. To address this concern, several data perturbation techniques have been proposed. However, it has been shown that either sensitive information can still be extracted from the perturbed data with little prior knowledge, or that many patterns are lost. In this paper we show that generating new data is an inherently safer alternative. We present a data generator based on the models obtained by the MDL-based KRIMP [18] algorithm. These are accurate representations of the data distributions and can thus be used to generate data with the same characteristics as the original data. Experimental results show a very large pattern similarity between the generated and the original data, ensuring that viable conclusions can be drawn from the anonymised data. Furthermore, anonymity is guaranteed for suited databases and the quality–privacy trade-off can be balanced explicitly.

Similar resources

A Survey of Cryptographic and Non-cryptographic Techniques for Privacy Preservation

Cryptography is to become familiar with the requirement of large, complex, information rich data sets for its privacy preservation. The privacy preserving data mining has been generated; to go through the concept of privacy in data mining is hard. Several algorithms and approaches are being generated theoretically, but practically it is hard. Privacy in data mining can be achieved through seve...

Privacy-preserving Clustering of Data Streams

Most previous studies on privacy-preserving data mining placed specific importance on the security of massive amounts of data from a static database; consequently, data undergoing privacy preservation often leads to a decline in the accuracy of mining results. Furthermore, following the rapid advancement of Internet and telecommunication technology, data types have transformed...

Analyzing the Privacy Preserving Using Big Data Techniques

Recently, big data has become a hot research topic. The rising amounts of big data also increase the chance of violating the privacy of individuals. Since big data needs high computational power and large storage, distributed systems are used. As multiple parties are involved in these systems, the risk of privacy violation is increased. There have been a number of privacy-preserving methods develop...

Privacy Preserving Data Mining Using Additive Perturbation on Relational Streaming Data

Data mining is concerned with extracting the required important data from the database and ignoring the rest. With the success of data mining, privacy preservation has also acquired great importance. The new concept of privacy-preserving data mining (PPDM) is concerned with preserving the privacy of sensitive individuals' data. In this paper, privacy of sensitive attribute data concerned with individual...

A review on Security in Distributed Information Sharing

In recent years, privacy preserving data mining has emerged as a very active research area in data mining. Over the last few years this has naturally led to a growing interest in security or privacy issues in data mining. More precisely, it became clear that discovering knowledge through a combination of different databases raises important security issues. Privacy preserving data mining is on...

A Privacy Preservation Framework for Big Data (Using Differential Privacy and Overlapped Slicing)

We are in the midst of big data. The rate of data generation is increasing very rapidly. We need to understand and analyze this data as quickly as possible. A delay of a millisecond in understanding the data may cost not only money but also life. There are various processing and analytic mechanisms like Hadoop and MapReduce to process the data. But as big data comprises an enormous amount of ...

Publication date: 2007